Conversation
Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
…pat tests Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
Greptile Overview

Greptile Summary

This PR updates the entire spark-rapids-ml codebase to align with cuML 25.12 nightly builds, addressing multiple breaking changes in the upstream RAPIDS ecosystem. The changes include comprehensive version updates across all components (Python package, JVM artifacts, Docker images, documentation) from 25.10.0 to 25.12.0, along with infrastructure updates to handle cuML's new dependency model.

Key technical changes include:
- adapting LogisticRegression to pass LBFGS parameters to the constructor instead of setting them post-initialization, and removing automatic objective computation;
- migrating KMeans from the deprecated class to KMeansMG;
- adding explicit installation of previously bundled dependencies (pylibraft, raft-dask) across all deployment scripts;
- updating model field initialization patterns for PCA, UMAP, and RandomForest to work with cuML's new internal structure;
- enhancing test reliability by fixing DBSCAN tests that could fail in multi-GPU environments.

The PR maintains API compatibility for end users while adapting to significant internal changes in cuML 25.12, ensuring the spark-rapids-ml library continues to provide GPU-accelerated drop-in replacements for Spark ML algorithms.

Important Files Changed
Confidence score: 4/5
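The summary's point about passing LBFGS parameters to the constructor rather than mutating the object after initialization can be sketched as follows. This is a hedged illustration of the API pattern only: the `Solver` class and its parameter names are hypothetical stand-ins, not cuML's actual classes.

```python
# Hypothetical sketch of the breaking change described above: a library that
# previously tolerated solver options being set after construction now
# requires them at construction time. `Solver`, `max_iter`, and `tol` are
# illustrative stand-ins, not cuML's real API.

class Solver:
    def __init__(self, max_iter=100, tol=1e-4):
        # New-style API: options are fixed when the object is created.
        self.max_iter = max_iter
        self.tol = tol

# Old pattern (what the PR moves away from):
#   solver = Solver()
#   solver.max_iter = 200   # post-initialization mutation
#   solver.tol = 1e-6

# New pattern: pass everything to the constructor up front.
solver = Solver(max_iter=200, tol=1e-6)
print(solver.max_iter, solver.tol)
```

The design point is that constructor-time parameters let the library validate and freeze configuration before any expensive GPU state is allocated.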
Sequence Diagram

```mermaid
sequenceDiagram
    participant Developer
    participant Docker as "Docker Build"
    participant cuML as "cuML 25.12"
    participant LogReg as "Logistic Regression"
    participant Tests as "Test Suite"
    participant Docs as "Documentation"
    Developer->>Docker: Update RAPIDS_VERSION to 25.12
    Docker->>Docker: Install cuml=25.12 cuvs=25.12 pylibraft=25.12
    Docker->>Docker: Update scipy minimum to 1.11
    Developer->>LogReg: Handle cuML breaking changes
    LogReg->>LogReg: Remove training objective computation
    LogReg->>LogReg: Handle single label value cases
    LogReg->>LogReg: Update lbfgs parameter handling
    LogReg->>cuML: Set model fields for inference
    Developer->>Tests: Fix DBSCAN multi-GPU issues
    Tests->>Tests: Update test configurations
    Tests->>Tests: Handle sparse data optimization
    Developer->>Docs: Update version references
    Docs->>Docs: Update notebook examples
    Docs->>Docs: Refresh API documentation
    Developer->>Tests: Run full test suite
    Tests->>cuML: Validate cuML 25.12 compatibility
    Tests->>LogReg: Verify logistic regression fixes
    Tests->>Tests: Confirm DBSCAN stability
    Tests-->>Developer: All tests passing
    Developer->>Developer: Finalize PR for 25.12 release
```
Additional Comments (3)
- python/benchmark/databricks/gpu_cluster_spec.sh, line 58 (link) logic: The init script `init-pip-cuda-13.0-nightly.sh` does not exist in the repository. This will cause cluster creation to fail. The available script is `init-pip-cuda-12.0.sh`.
- python/benchmark/databricks/gpu_etl_cluster_spec.sh, line 69 (link) logic: The init script `init-pip-cuda-12.0-nightly.sh` does not exist in the repository. This will cause cluster creation to fail. The available script is `init-pip-cuda-12.0.sh`.
- python/README.md, line 25 (link) syntax: The `$RAPIDS_VERSION` shell variable is used but not defined in this context. This command won't work as documented.
32 files reviewed, 3 comments
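The undefined `$RAPIDS_VERSION` failure mode flagged above can be guarded against by failing fast instead of letting an install command silently receive an empty version pin. A minimal sketch, assuming the version is exported as an environment variable before the package spec is composed; the helper function here is hypothetical, not part of the repository.

```python
# Sketch of failing fast when RAPIDS_VERSION is unset. The environment
# variable name matches the docs under review; `rapids_install_spec` is an
# illustrative helper, not real project code.
import os

def rapids_install_spec(package: str) -> str:
    version = os.environ.get("RAPIDS_VERSION")
    if not version:
        raise RuntimeError(
            "RAPIDS_VERSION is not set; e.g. run `export RAPIDS_VERSION=25.12` first"
        )
    return f"{package}={version}"

os.environ["RAPIDS_VERSION"] = "25.12"  # what the docs could instruct readers to do
print(rapids_install_spec("cuml"))      # -> cuml=25.12
```

Documenting the export step once, before the first install command, would resolve all three variants of this review comment.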
…bricks scripts Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
…ckerfile Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
Additional Comments (2)
- notebooks/databricks/init-pip-cuda-12.sh, line 46 (link) logic: The `numpy~=1.0` constraint is too broad and will install numpy 1.x, which could include very old incompatible versions. It should specify a more restrictive minimum version. What specific numpy version range is required for compatibility with RAPIDS 25.12 and Databricks 13.3?
- python/README.md (link) logic: The `$RAPIDS_VERSION` variable is referenced but not defined in the conda command. Users will get an error unless this variable is set. Should this use a specific version number like the other packages, or should there be instructions to define RAPIDS_VERSION first?
35 files reviewed, 2 comments
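To see why the `numpy~=1.0` pin is so loose, a stdlib-only sketch of the comparison makes the point: `~=1.0` is effectively `>=1.0,<2.0`, so every 1.x release qualifies. The comparison logic below is a simplified illustration, not a full PEP 440 specifier parser.

```python
# Minimal illustration of why `~=1.0` admits every numpy 1.x release,
# including very old ones. Simplified version comparison, not PEP 440.

def version_tuple(v: str) -> tuple:
    return tuple(int(part) for part in v.split("."))

def satisfies_compatible_release(v: str, floor: str = "1.0") -> bool:
    # `~=1.0` means >= 1.0 and < 2.0 (same major version).
    t, f = version_tuple(v), version_tuple(floor)
    return t >= f and t[0] == f[0]

print(satisfies_compatible_release("1.4.1"))   # -> True (ancient, likely incompatible)
print(satisfies_compatible_release("1.26.4"))  # -> True
print(satisfies_compatible_release("2.0.1"))   # -> False

# A tighter pin such as >=1.21,<2.0 excludes the old release:
def satisfies_tight(v: str) -> bool:
    return version_tuple("1.21") <= version_tuple(v) < version_tuple("2.0")

print(satisfies_tight("1.4.1"))   # -> False
```

This is why the reviewers suggest stating an explicit lower bound rather than relying on the compatible-release operator anchored at 1.0.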
Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
Additional Comments (4)
- python/README.md, line 25 (link) style: The `$RAPIDS_VERSION` variable is used but not defined in this documentation; users may be confused about what value to substitute. Should the documentation include instructions on how to set or determine the appropriate RAPIDS_VERSION value?
  Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
- notebooks/databricks/init-pip-cuda-12.sh, line 46 (link) style: The numpy version constraint `~=1.0` is extremely permissive and may not address the scipy 1.11 compatibility issue mentioned in the PR description. Should this be a more specific numpy version constraint to ensure compatibility with scipy 1.11 requirements?
- python/src/spark_rapids_ml/metrics/utils.py, line 56 (link) style: The `is True` comparison is unnecessary; `if lr_model.getStandardization():` would be more pythonic.
- python/src/spark_rapids_ml/regression.py, line 806 (link) style: Setting `n_cols` twice appears redundant; lines 801 and 806 both set the same attribute. Is the duplicate assignment on line 806 intentional for compatibility reasons, or should one of these be removed?
36 files reviewed, 4 comments
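The `is True` nit above generalizes: when a getter is typed to return a bool, comparing its result with `is True` adds nothing over the plain truthy check. A small sketch, with a hypothetical `FakeModel` standing in for the real Spark ML model and its `getStandardization` accessor:

```python
# Sketch of the `is True` review comment: for a method that returns a bool,
# the truthy check is equivalent and more idiomatic. `FakeModel` is a
# stand-in for the real Spark ML model referenced in the comment.

class FakeModel:
    def __init__(self, standardization: bool):
        self._standardization = standardization

    def getStandardization(self) -> bool:
        return self._standardization

model = FakeModel(standardization=True)

# Redundant:  if model.getStandardization() is True: ...
# Idiomatic:
if model.getStandardization():
    branch = "standardize"
else:
    branch = "skip"
print(branch)  # -> standardize
```

Note that `is True` is also stricter than a truthy check for non-`bool` return values, so it can silently take the wrong branch if the getter ever returns a truthy non-boolean; for a typed boolean getter, the truthy form is both safer and cleaner.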
Signed-off-by: Erik Ordentlich <eordentlich@gmail.com>
Additional Comments (4)
- python/benchmark/databricks/cpu_cluster_spec.sh, line 27 (link) style: The spot_bid_price_percent parameter is now unused since availability is ON_DEMAND.
- notebooks/databricks/init-pip-cuda-12.sh, line 45 (link) logic: The numpy version constraint `~=1.0` seems overly permissive and may allow incompatible versions. Should this be a more restrictive constraint like `numpy>=1.21,<2.0` to ensure compatibility with the RAPIDS ecosystem?
- python/src/spark_rapids_ml/regression.py, line 809 (link) style: Redundant assignment of `lr.n_cols` since it was already set on line 804.
- python/src/spark_rapids_ml/classification.py, lines 1074-1091 (link) style: The exception handling properly catches cuML's new single-label restriction, but the string-matching approach using `traceback.format_exc()` is fragile and could break if cuML's error messages change.
36 files reviewed, 4 comments
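The fragility of matching on `traceback.format_exc()` can be contrasted with catching a specific exception type. This is a hedged sketch only: the `SingleLabelError` class and its message are illustrative stand-ins, not what cuML actually raises.

```python
# Contrast of the fragile pattern flagged above (substring-matching the
# formatted traceback) with catching a specific exception type.
# `SingleLabelError`, `fit_stub`, and the message text are hypothetical.
import traceback

class SingleLabelError(ValueError):
    pass

def fit_stub():
    raise SingleLabelError("only one label value present in y")

# Fragile: breaks if the upstream message wording ever changes.
def handle_by_string_match() -> str:
    try:
        fit_stub()
    except Exception:
        if "only one label value" in traceback.format_exc():
            return "single-label fallback"
        raise

# More robust: depends on the exception type, not its message text.
def handle_by_type() -> str:
    try:
        fit_stub()
    except SingleLabelError:
        return "single-label fallback"

print(handle_by_string_match())  # -> single-label fallback
print(handle_by_type())          # -> single-label fallback
```

When the upstream library only raises a generic exception type, checking the caught exception's `args` directly is still less brittle than searching the full formatted traceback, which also includes file paths and line numbers that change between releases.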
The default CUDA version has been updated in #997. Update the image tag to use the latest docker image to run CI. Signed-off-by: YanxuanLiu <yanxuanl@nvidia.com>
Mainly attempts to align with some breaking cuML changes, including